Abstract


A  network  model  is  defined  in  terms  of  simple  relationships  among 
data.  The  hierarchical  and  network  approaches  to  DBMS  architecture  are 
outlined  using  the  model.  A  network  language  is  proposed  to  declare 
and  manipulate  access  paths  between  data.  It  is  claimed  that  the 
proposed  framework  can  be  appropriate  as  a  basis  for  relation  implementation. 
Finally,  some  implementation  problems  for  building  access  paths  are  discussed. 
1. Introduction

A relation as a data structure is a very basic concept [...]
representing traditi[...] For instance, relations resemble the record
[...] [Codasyl DBTG]. They can also be thought of [...]
can be implemented on [...] implementation. If [...]
relations, then the resulting system [is] both
inefficient and hard to implement.


One  of  the  important  considerations  in  the 
implementation  of  a  relational  system  is  distinguishing 
among  the  different  roles  of  relations.  At  different  levels 
of  the  system  relations  play  different  roles;  a  file  is  a 
relation,  a  record  type  is  a  relation,  a  system  table  is  a 
relation,  etc.  It  seems  circular  that  relations  are  used  to 
implement  relations.  However,  this  circularity  is  only 
superficial.  Clearly,  operations  allowed  on  files  are 
different  than  the  operations  allowed  on  record  types,  or 
user  oriented  relations.  We  can  extend  a  file,  or  order  a 
record  type,  or  join  a  relation.  But  we  cannot  sort  a 
conceptual  relation  since  there  is  no  order  involved  [Codd 
1970].  One  should  distinguish  different  data  structures  not 
only  according  to  structure,  but  also  according  to 
operations allowed on them. The power and elegance of the
relational model comes mainly from the operators allowed on
relations, e.g., join, division, etc. [Codd 1970]. We will
discuss  mechanisms  to  implement  such  operators. 

At the user level we distinguish between two kinds of
relations: primary and derived. A primary relation is one
which has a generic, independent existence. A derived
relation  is  one  which  is  defined  by  a  sequence  of  operations 
from  existing  relations.  A  derived  relation  totally  depends 
on  the  existing  primary  relations.  This  distinction  should 
not  carry  any  connotations  of  implementation.  One  should 
not  think  that  primary  relations  are  necessarily  implemented 
as  separate  files  and  that  derived  relations  are  necessarily 
implemented  as  cross  links  between  files.  A  primary 
relation  may  actually  be  implemented  using  pointers  and  data 
pools,  while  a  commonly  used,  rather  static,  derived 
relation  can  be  implemented  on  a  separate  file. 


Consider  a  system  that  supports  a  set  of  primary 
relations  and  the  derived  relations  which  can  be  obtained  by 
the  relational  algebra  [Codd  1970  ].  While  implementing  this 
facility  we  are  faced  with  the  following  problem.  How  can 
we  avoid  repeatedly  regenerating  the  same  derived  relations, 
especially  ones  that  are  commonly  used?  For  instance,  a  user 
may  want  to  derive  a  relation  which  later  becomes 
independent.  That  is,  it  is  no  longer  defined  in  terms  of 
primary  relations;  subsequent  updates  of  the  primary 
relations  do  not  affect  the  derived  relation.  It  represents 
the data base at a particular time. On the
other  hand,  a  derived  automatic  relation  will  always  agree 
with  the  current  data  stored  in  the  primary  relations  that 
define  it.  Consider  the  user-defined  derived  relation  R 
obtained from some operations on existing relations. The
system may choose to save R or some of the information used
to obtain R. If R is automatic, it should always agree with
what one would obtain by redefining [...] us to
construct and sometimes preserve these structures in a fast
and natural way. We will outline a facility to serve this
purpose.

The  facility  which  we  propose  can  be  thought  of  as  an 
intermediate  level,  or  virtual  machine,  which  can  be  used  to 
[...] to obtain a solid framework for relation implementation. At
the  same  time  we  will  try  to  isolate  the  set  of  operations 
on  networks  which  are  relevant  for  relation  implementation. 
In  this  manner  some  of  the  options  and  richness  of  the  DBTG 
proposal  that  create  problems  can  be  avoided. 

2.  Links 

Many  authors,  when  discussing  data  management,  rely 
heavily  on  the  idea  of  1:N  or  N:M  relationships  among  data 
[Bachman 1969]. We will define such a relationship and
explore  its  properties. 

Consider  a  set  of  record  types  denoted  by  <X,Y,Z,...>. 
Within  each  record  type  we  will  denote  its  fields  (or  data 
items,  or  data  elements)  by  <A,B,C, ...>.  Record  occurrences 
will  be  denoted  by  <x,y,z,...>  and  data  item  values  by 
<a,b,c,...>.  The  notation  is  similar  to  reasonably  accepted 
terminology [Codasyl DBTG]. Record types correspond
essentially to collections of records of the same type, as
they implement, for instance, primary relations. We will not
discuss how record types are chosen. We expect them to
abide by reasonable constraints such as third normal form
[Codd 1971]. Choice of record types is a decision reached
at  another  level  of  the  system.  We  will  provide  the 
mechanisms  to  represent  such  decisions  rather  than  trying  to 
make  them  at  this  level. 

A link between two record types X and Y is a mapping
between records of type X and records of type Y. A link Lxy
from X to Y will connect each record x in X to a set of
records y (possibly empty) of type Y. Each link has by
definition an inverse link. The inverse of Lxy is a mapping
from Y to X. For each record y of type Y the inverse of Lxy
will define a set of records x, namely, the set of records x
which are mapped by Lxy to y.
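As an illustration only, a link and its inverse can be sketched in Python as two dictionaries built from the same set of (x, y) pairs. The class name `Link` and the methods `image` and `inverse_image` are assumptions made for this sketch; they are not part of the proposed language.

```python
from collections import defaultdict


class Link:
    """A link Lxy held as (x, y) pairs; all names here are illustrative."""

    def __init__(self, pairs):
        self.forward = defaultdict(set)   # x -> set of linked y records
        self.backward = defaultdict(set)  # y -> set of linked x records
        for x, y in pairs:
            self.forward[x].add(y)
            self.backward[y].add(x)

    def image(self, x):
        """Records of type Y reached from x (possibly empty)."""
        return set(self.forward.get(x, set()))

    def inverse_image(self, y):
        """Records of type X that the link maps to y."""
        return set(self.backward.get(y, set()))


lxy = Link([("x1", "y1"), ("x1", "y2"), ("x2", "y2")])
```

Because both directions are derived from the same pairs, the inverse always exists by construction, which mirrors the definition in the text.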

A link defines an N:M relationship among records of type
X and Y. It is similar, but not identical, to the idea of
an owner-coupled set [Codasyl DBTG]. We will point out the
differences in section 3. Note that links can be created
from X to X. It is perfectly reasonable and often useful.

A link can be declared to be either automatic or manual.

An automatic link between two different types is expressed by a
relationship between their data items. Such a link provides
only an access path and carries no additional structural
information. For instance consider a predicate W which
relates data items A,B,C of type X and data items D,E,F of
type Y. An automatic link according to W will link records
of type X and records of type Y, such that the values of
their data elements satisfy W. If, for example, W was
(A-D)(B-E)(C-F) = 0 then each x of X will be linked with
all y in Y which have either a=d, or b=e, or c=f.
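The example predicate can be tried out with a small Python sketch. The record layouts `XRec` and `YRec`, the function names and the sample values are all assumptions chosen for illustration; the sketch simply pairs every x with every y and keeps the pairs that satisfy W.

```python
from typing import Callable, NamedTuple


class XRec(NamedTuple):
    a: int
    b: int
    c: int


class YRec(NamedTuple):
    d: int
    e: int
    f: int


def w(x: XRec, y: YRec) -> bool:
    # (A-D)(B-E)(C-F) = 0 holds exactly when at least one factor is zero.
    return (x.a - y.d) * (x.b - y.e) * (x.c - y.f) == 0


def automatic_link(xs, ys, predicate: Callable) -> set:
    """Return every (x, y) pair whose data item values satisfy the predicate."""
    return {(x, y) for x in xs for y in ys if predicate(x, y)}


xs = [XRec(1, 2, 3), XRec(7, 8, 9)]
ys = [YRec(1, 0, 0), YRec(0, 8, 0), YRec(5, 5, 5)]
pairs = automatic_link(xs, ys, w)
```

Here the first x is linked to the first y because a=d, and the second x is linked to the second y because b=e; the third y matches nothing.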

A  manual  link  Lxy  is  defined  according  to  specific 
explicit  operations  which  link  a  record  x  of  type  X  to 
another  record  y  of  type  Y.  In  this  manner  after  a  record  x 
is  inserted  it  can  be  manually  attached  to  one  or  more 
[...] which have been defined between them. We declare each
record  type  in  the  usual  way. 

DECLARE RECORD <record_name>

ITEM <item_name> TYPE <type_name>

ITEM <item_name> TYPE <type_name>


A link is defined as either automatic or manual.


DEFINE LINK <link_name> BETWEEN <record_name> AND
<record_name> AUTOMATIC <predicate_definition>

DEFINE LINK <link_name> BETWEEN <record_name> AND
<record_name> MANUAL

After links are defined they can be created. In the
case of an automatic link it implies that access paths are
constructed which relate the appropriate records of X and Y.
The definition results only in storing the predicate
definition of the link in suitable form. The link can be
used afterwards, but we do not expect any fast connections
since the access paths are not built yet. The link creation
establishes the fast connections. The following command
applies only to automatic links.


CREATE LINK <link_name> FROM <record_name> TO <record_name>


Notice that the creation is unidirectional. Namely, each
direction of the link has to be created independently.
There is no reason, for instance, to construct both access
paths unless they are taken equally often, or one comes for
free according to the implementation.
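The difference between defining and creating a link can be sketched as follows. This is a minimal illustration under stated assumptions: the class `AutomaticLink`, the direction names `"x_to_y"` and `"y_to_x"`, and the use of a dictionary as the access path are all hypothetical and say nothing about how a real system would store the paths.

```python
class AutomaticLink:
    """Illustrative sketch: DEFINE keeps the predicate, CREATE builds one path."""

    def __init__(self, predicate):
        self.predicate = predicate  # what DEFINE LINK stores
        self.paths = {}             # direction -> {record: [linked records]}

    def create(self, direction, xs, ys):
        """Build the access path for one direction only."""
        if direction == "x_to_y":
            self.paths[direction] = {
                x: [y for y in ys if self.predicate(x, y)] for x in xs}
        elif direction == "y_to_x":
            self.paths[direction] = {
                y: [x for x in xs if self.predicate(x, y)] for y in ys}
        else:
            raise ValueError(f"unknown direction: {direction}")

    def follow_from_x(self, x, ys):
        """Use the stored path if it was created, else scan Y with the predicate."""
        path = self.paths.get("x_to_y")
        if path is not None:
            return path.get(x, [])
        return [y for y in ys if self.predicate(x, y)]


xs, ys = [1, 2, 3], [2, 4, 6]
link = AutomaticLink(lambda x, y: y == 2 * x)  # defined, not yet created
slow = link.follow_from_x(2, ys)               # answered by scanning Y
link.create("x_to_y", xs, ys)                  # only this direction is built
fast = link.follow_from_x(2, ys)               # answered from the access path
```

Both calls return the same records; only the cost differs. The opposite direction stays unbuilt until it is created separately.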


In the case of a manual link records are linked
explicitly by the command


CONNECT <link_name> CURRENT <record_name> TO CURRENT <record_name>


There are implicit currency indicators in the system
which specify which records are current of the first and
second record type. We will not elaborate very much on
currency indicators because we plan to limit our attention
eventually to only automatic links.

We also define operators Px which operate on a single
record type X. The result of an operator [...] set [...]
the restriction of a relation and other similar operations can
be considered as operators. We can define and create an
automatic operator by:

DEFINE AUTOMATIC OPERATOR <operator_name> ON <record_name>
DEFINED BY <definition>


CREATE OPERATOR <operator_name>


The definition of an operator, as in a definition of an
automatic link, will use item names and/or free variables.
The creation of the operator will result in an additional
data structure which will enable a user to quickly obtain
the subset specified by the operator.


We can also have a manual operator analogous to the
definition of manual links. Namely we can define a manual
operator and then create it a record at a time. We need
again to use a currency indicator to specify which record
should be included in the operator.


DEFINE MANUAL OPERATOR <operator_name> ON <record_name>
INCLUDE CURRENT OF <record_name> IN <operator_name>


All commands are matched by corresponding destroy commands.
We can DESTROY links and operators. We can DISCONNECT and
EXCLUDE entries from manual links and operators. It is
also wise to be able to UNDEFINE links and operators, when
their usefulness goes away.
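The two kinds of operator can be contrasted with a short Python sketch. The class names, the tuple layout of the sample records and the predicate are assumptions made for illustration; an automatic operator is modelled as a stored predicate whose subset is materialized on creation, and a manual operator as a list filled one record at a time.

```python
class AutomaticOperator:
    """DEFINE stores a predicate over one record; CREATE materializes the subset."""

    def __init__(self, definition):
        self.definition = definition
        self.members = None  # nothing is built until create() is called

    def create(self, records):
        self.members = [r for r in records if self.definition(r)]

    def destroy(self):
        self.members = None  # the definition itself is kept


class ManualOperator:
    """Filled explicitly, one current record at a time."""

    def __init__(self):
        self.members = []

    def include(self, current):
        if current not in self.members:
            self.members.append(current)

    def exclude(self, current):
        if current in self.members:
            self.members.remove(current)


employees = [("ann", 30), ("bob", 45), ("cy", 52)]  # hypothetical (name, age)

senior = AutomaticOperator(lambda r: r[1] > 40)
senior.create(employees)

picked = ManualOperator()
picked.include(employees[0])
picked.include(employees[2])
picked.exclude(employees[0])
```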

We will now define expressions of operators and links
with record types as arguments. Each expression will represent
a set of records of a specific type. The expressions are
formed by the following rules:

1) An expression Ex of type X is X

2) An expression Ex of type X is an operator Px on Ex,
Px*Ex

3) An expression Ey of type Y is a link Lxy from X to Y
operating on an expression Ex of type X, Lxy*Ex

4) An expression Ex of type X is a union of two
expressions of type X

5) An expression Ex of type X is an intersection of two
expressions of type X

6) An expression Ex of type X is the difference between
two expressions of type X

7) All expressions are obtained only with rules 1 to 6.
Parentheses are sometimes used to specify the order
of evaluation.

For instance an expression can be:

Lyz*(Lxy*Px*X ∪ Py*Y) ∪ Lxz*X

where Lyz, Lxy, Lxz are links and Px, Py are operators.

The semantics of the expression is:

1) Take all records of type X and map them with Lxz to a
set A of records Z

2) Take all records of type Y and restrict them with
operator Py, call set B

3) Take all records of type X, restrict them with Px, and
map the result to records Y with Lxy, call set C

4) Take union of C and B, call set D

5) Map D on records of type Z using Lyz, call set E

6) Take union of A and E and call it F

7) The given expression represents the records in set F
of type Z.
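The evaluation steps can be traced with Python sets. This is only a sketch: record occurrences are plain strings, links are sets of pairs, operators are predicates, and all of the sample data is invented for the illustration.

```python
def apply_link(pairs, records):
    """L*E: every target reached by the link from some record in E."""
    return {y for (x, y) in pairs if x in records}


def apply_operator(predicate, records):
    """P*E: the records of E that satisfy the operator's definition."""
    return {r for r in records if predicate(r)}


def px(x):
    return x != "x2"


def py(y):
    return y == "y3"


# Hypothetical record occurrences and links.
X = {"x1", "x2", "x3"}
Y = {"y1", "y2", "y3"}
Lxy = {("x1", "y1"), ("x2", "y2")}
Lyz = {("y1", "z1"), ("y3", "z3")}
Lxz = {("x3", "z4")}

A = apply_link(Lxz, X)                      # step 1: Lxz*X
B = apply_operator(py, Y)                   # step 2: Py*Y
C = apply_link(Lxy, apply_operator(px, X))  # step 3: Lxy*Px*X
D = C | B                                   # step 4: union of Y records
E = apply_link(Lyz, D)                      # step 5: map to Z records
F = A | E                                   # step 6: union of the two Z sets
```

Each intermediate set holds records of a single type, which is what makes the unions in steps 4 and 6 well defined.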

Expressions can be defined and created in much the same
way as links and operators:

DEFINE EXPRESSION <expression_name>

DEFINITION <expression_definition>

CREATE EXPRESSION <expression_name>


In  order  to  define  an  expression,  all  the  links  and 
operators  used  in  its  definition  have  to  be  defined.  In 
order  to  create  an  expression,  all  the  links  and  operators 
in  its  definition  need  not  be  created  in  advance,  but  the 
operation will be rather slow if they are not.

We claim that by using links, operators and their
expressions we have a complete language to operate on
records. We will expand on this in sections 4 and 5.

3. Expressing structure with links.

Many existing data base management systems view records
as related according to explicit structures. The two main
points of view are:

1)  hierarchical  systems,  such  as  IMS,  GIS,  TDMS,  SYSTEM 
2000 

2)  network  systems,  such  as  IDS,  EDMS,  TDMS,  TOTAL 

We  will  discuss  these  two  approaches  to  data  base 
management  and  how  they  relate  to  links. 


A hierarchical data base is composed of a set of record
types (sometimes referred to as segment types) [IMS]. The
record types are related by a definition tree. Each branch
in the definition tree represents a 1:N relationship between
the father type and the son type. The data base
consists of a set of record occurrences which are structured
in trees generated according to the definition tree. Users
traverse the record occurrence data base trees and pick
appropriate records. This structure can be explained and
represented with links, but with several restrictions. A
hierarchical data base is a collection of record types and
links between them which obey the following constraints.

R1) Functionality. Every link permitted is functional
in at least one direction. For every Lxy, its
inverse is a total function. That is, each x can
be related to many y's but each y is related to
exactly one, unique, x.

R2) No Lxx. A link cannot be defined on the same record
type.

R3) Uniqueness of Lxy. For each pair of record types X
and Y there is at most one link Lxy.

R4) Tree structure. If we represent each record type X
as a node and the links between record types as
branches, then the links form a tree structure.
There is no node which is not connected. This rule
implies that each record type is linked to one
parent record type and may be linked to one or more
son record types.


R5) Only manual. All links are manual. Note that this
refers to our definition of manual operation. In the
DBTG sense these links will not be called manual.


If  we  have  restrictions  according  to  rules  R1,  R2,  R3, 
R4  then  each  record  occurrence  has  to  be  linked  to  one  and 
only  one  record  occurrence  of  the  father  type.  This  enables 
us  to  define  a  combined  operation  (insert,  link)  by  pointing 
to  one  record  occurrence  and  inserting  a  record  as  a  son. 
This  operation  only  applies  to  hierarchical  systems.  In 
addition,  when  we  delete  a  record  all  records  connected  to 
it,  except  its  parent,  should  also  be  deleted.  This  follows 
from the fact that a record occurrence cannot exist without
being  linked  to  a  father. 
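The schema-level rules can be checked mechanically. The sketch below is an illustration under assumptions: links are given as hypothetical (father type, son type) pairs, the function name `is_hierarchical` is invented, and only rules R2, R3 and R4 are tested. Rule R1 concerns record occurrences and rule R5 concerns how links are declared, so neither is covered here.

```python
def is_hierarchical(record_types, links):
    """Check rules R2, R3 and R4 for links given as (father, son) type pairs."""
    if any(father == son for father, son in links):
        return False  # R2: no Lxx
    pairs = [frozenset(link) for link in links]
    if len(pairs) != len(set(pairs)):
        return False  # R3: at most one link per pair of record types
    father_of = {}
    for father, son in links:
        if son in father_of:
            return False  # R4: a son type has only one father type
        father_of[son] = father
    roots = [t for t in record_types if t not in father_of]
    if len(roots) != 1:
        return False  # R4: one connected tree, hence one root
    for start in record_types:
        node, visited = start, set()
        while node in father_of:
            if node in visited:
                return False  # a cycle never reaches the root
            visited.add(node)
            node = father_of[node]
        if node != roots[0]:
            return False  # climbed to a type outside the tree
    return True
```

For example, a DEPT, EMP, SKILL chain passes, while adding a second father for SKILL or a link from EMP to itself fails.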

A network data base, as proposed by DBTG, is composed of
a set of record types with links having the following
restrictions.

R2) No Lxx. This rule corresponds to the restriction in
DBTG of having a different owner type and member
type(s) for each set type [Codasyl DBTG].


R6) Weak functionality. For every link Lxy, its inverse
is a partial function. That is, each x can be
related to many y's, but each y is related to at
most one (possibly none) x. This rule corresponds
to the restriction in DBTG allowing each record
occurrence to belong to no more than one set
occurrence of a certain type. Notice the
difference between rule R1 (functionality) and rule
R6 (weak functionality), stemming from the optional
feature of DBTG [Codasyl DBTG].


The definition of [...] "automatic" definition [...]
automatic definition, [...] corresponding set has to [...]
less manually by curren[...] facility of linking many [...]
defining a set of links [...]

If we view all these links as one chain, we have the
equivalent of an owner coupled set.

The  DBTG  proposal  allows  the  possibility  to  structure 
links  in  a  network.  Such  a  facility  is  sometimes  called 
structurally  complete.  It  is  obvious  that  our  model  using 
links  is  structurally  complete.  In  addition,  unlike 
hierarchical  systems,  network  systems  allow  the  ability  to 
define  many  different  links  between  two  record  types.  In 
order  to  access  records  in  the  data  base,  users  have  to 
navigate  [Bachman  1973]  according  to  the  links  defined 
between records. Operations are effected on one record at a
time. This situation implies the use of complicated
currency indicators. We represent sets of records with link
expressions. Hence we avoid many instances of currency
indicators at this level of the system.

The  DBTG  proposal  includes  many  other  considerations 
like DML, implementation hints, security provisions, etc.
[Codasyl DBTG]. We are only looking at the data structuring
facility.

A relational system is not exactly mirrored by a set of
links  on  record  types.  However,  if  we  consider  a  relational 
implementation,  we  can  see  clearly  the  role  of  links  among 
record  types.  Consider  a  set  of  record  types  corresponding 
to  a  set  of  primary  relations.  The  ability  to  derive 
relations  according  to  the  relational  algebra  is  essential 
and constitutes a key part of a relational system. Consider
for example the join of two relations R and S. Suppose R is
implemented by the record type X and S by the record type Y.
It is easy to see that we can define a link Lxy among X and
Y which expresses the join. The link's predicate definition
will correspond to the condition among domains for the join
[Codd 1970]. Using the link we can relate domains of R in
the  join  with  domains  of  S  in  the  join  and  vice  versa. 
Hence  the  link  implements  the  join. 
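A small Python sketch may make the correspondence concrete. The relations, their tuple layouts and the function names are assumptions for illustration: the link is the set of record pairs satisfying the join condition, and the joined tuples are read off by concatenating each pair.

```python
def join_link(r_records, s_records, condition):
    """Build the link Lxy: pairs (x, y) that satisfy the join condition."""
    return [(x, y) for x in r_records for y in s_records if condition(x, y)]


def joined_tuples(link_pairs):
    """Read the join off the link by concatenating each linked pair."""
    return [x + y for x, y in link_pairs]


# Hypothetical relations: R(employee, dept) and S(dept, floor).
R = [("ann", "d1"), ("bob", "d2")]
S = [("d1", 3), ("d3", 5)]

lxy = join_link(R, S, lambda x, y: x[1] == y[0])  # equality on the dept domain
result = joined_tuples(lxy)
```

Note that the link itself only relates existing records of X and Y; forming the concatenated tuples is a separate step outside the link.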

We  will  now  define  a  framework  of  links  which  we  feel  is 
both  adequate  and  appropriate.  We  will  restrict  the  links 
to  specific  kinds.  The  main  guidelines  for  restrictions 
will be, in decreasing priority:

1)  Ease  of  use,  economy  of  concept 

2)  Implementation  use  and  efficiency 

3)  Compatibility  with  existing  systems  (i.e.,  historical 
reasons) 

We  will  try  to  specify  the  reasons  why  particular 
restrictions  are  made. 

R6)  Weak  functionality.  This  restriction  is 
particularly  important  in  DBTG.  During  navigation 
we  want  at  any  record  occurrence  to  have  a  unique 
set  occurrence  of  a  certain  type  and  therefore  a 
unique  owner  record  occurrence  for  this  type.  In 
our  framework  it  is  not  absolutely  necessary. 
[...] allow manual links. The
disadvantages of manual links are connected with
the encoding of information in the structure of the
data. We feel that there are serious drawbacks in
this operation which hamper data independence [Date
and Codd]. Manual links can still be constructed by
explicitly declaring an intermediate record type
and inserting appropriate records in it. Assume,
for instance, that x in X and y in Y need to be
linked manually. In the intermediate record type
we insert a record with the keys of x and y. Two
automatic links according to the keys relate X and
Y with the intermediate record type. Hence, by
following these two links we can effectively get
from x to y. This method is often used in
relational systems for connecting arbitrary tuples
which have no relationship according to domain
values.
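The intermediate record type can be sketched in Python as follows. The field names `key`, `x_key` and `y_key`, the list `connections` and both function names are hypothetical; the point is only that the "manual" connection becomes ordinary data, reached through two key-equality links.

```python
# Hypothetical record types; each record carries a key data item.
X = [{"key": "x1"}, {"key": "x2"}]
Y = [{"key": "y1"}, {"key": "y2"}]
connections = []  # the intermediate record type, initially empty


def connect(x, y):
    """Insert an intermediate record holding the keys of x and y."""
    connections.append({"x_key": x["key"], "y_key": y["key"]})


def linked_from(x):
    """Follow the two automatic, key-equality links from x to records of Y."""
    middle = [m for m in connections if m["x_key"] == x["key"]]
    return [y for y in Y for m in middle if y["key"] == m["y_key"]]


connect(X[0], Y[1])
```

After the single `connect` call, following the two links from the first X record reaches the second Y record, and the other X record reaches nothing.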


Note that by allowing only automatic links we rule out
structural links which themselves represent extra
information. Such links have advantages for implementation
purposes. For instance, when implementing a hierarchical
data base with automatic links alone, we need to duplicate
many keys in the record types. We feel that the underlying
system implementing our language can use manual links for
efficiency purposes. The important restriction is that they
are not available at the user's level. In this manner their
use is transparent to the user and carefully restricted.
The situation is analogous to ruling out GO TOs from a high
level language. Branch instructions still exist in
hardware, but their use is not allowed to the user.


We  do  not  restrict  links  in  any  other  way.  The  ability 
to  handle  Lxx  links  is  very  important.  It  is  recognized 
even in the framework of DBTG. In a relational system there
are operations which relate tuples of the same relation. A
typical example is the relationship "managed by" in an [...]

Operators on record types are important for
qualification,  holding  intermediate  results  and  other 
operations.  They  are  particularly  important  in  a  system 
which handles more than one record at a time. After we
isolate  an  interesting  set  of  records  using  operator  link 
expressions,  we  still  have  to  pass  the  appropriate  records 
to  the  host  programming  language,  or  query  system.  For  this 
operation  we  may  need  to  obtain  one  record  at  a  time.  We 
outline  a  simple  facility  for  this  situation. 

For  every  record  type,  every  operator  created  and  every 
expression  created,  we  have  one  currency  indicator  in  the 
system.  This  currency  indicator  is  set  using  the  following 
command:

GET <NEXT | PRIOR | FIRST | LAST> OF <record_name | operator_name |
expression_name>

The  result  of  the  command  is  to  retrieve  only  one  record  and 
update  the  currency  indicator.  Note  that  we  do  not  use  a 
currency  indicator  for  links.  They  represent  a  mapping  and 
not  a  set  of  records.  Also  note  that  a  currency  indicator 
is  only  needed  for  an  operator,  or  expression  after  they  are 
created. 
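A currency indicator of this kind behaves much like a cursor over a created set of records. The following Python sketch is illustrative only: the class name is invented, and two behaviours are assumptions that the text does not specify, namely that NEXT on a fresh indicator yields the first record and that moving past either end returns nothing and leaves the indicator unchanged.

```python
class CurrencyIndicator:
    """One cursor per record type, created operator or created expression."""

    def __init__(self, records):
        self.records = list(records)
        self.position = None  # no current record yet

    def get(self, which):
        last = len(self.records) - 1
        if which == "FIRST":
            target = 0
        elif which == "LAST":
            target = last
        elif which == "NEXT":
            # Assumption: NEXT with no current record starts at the first one.
            target = 0 if self.position is None else self.position + 1
        elif which == "PRIOR":
            # Assumption: PRIOR with no current record starts at the last one.
            target = last if self.position is None else self.position - 1
        else:
            raise ValueError(f"unknown GET option: {which}")
        if target < 0 or target > last:
            return None  # off either end: the indicator is left unchanged
        self.position = target
        return self.records[target]
```

Each call retrieves at most one record and updates the indicator, which is all the host language needs in order to step through a result set.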


We  allow  the  use  of  manual  operators  although  we  forbid 
the  use  of  manual  links.  This  seems  contradictory,  but 
manual operators are useful in bookkeeping operations of the
system and for holding intermediate results. A manual
operator can now be changed according to the current record.
[...] not only of a record type, but of another operator or
expression. Hence the INCLUDE, EXCLUDE command for
operators is more general.

INCLUDE CURRENT OF <RECORD | OPERATOR | EXPRESSION> <name>
IN <operator_name>

We did not mention so far any commands for update, insert
and delete of records. We do not allow insert, delete or
update on anything else except declared record types. The
commands have the following format:

UPDATE CURRENT OF <RECORD | OPERATOR | EXPRESSION> <name>
WITH <modification>

INSERT RECORD IN <record_name> FROM <working_area>

DELETE CURRENT OF <RECORD | OPERATOR | EXPRESSION> <name>


Note that the result of an update, or delete will affect a
particular record of the record type specified in the
command. Affected created expressions and operators are
modified accordingly in an automatic manner.

6.  Effectiveness  of  the  framework 


All operations on record types defined in the previous
sections have as a result sets of records of the original
record types. An operator or an expression represents a set
of records of a particular type X. We cannot therefore
claim that we have completeness in the relational algebra
sense [Codd 1971]. For instance, consider two relations Rx
and Ry represented by the record types X and Y. The join
[...]

We claim that expressions are complete in a different
sense.  Consider  a  set  of  record  types  <X,Y,...>.  Consider 
all  primary  relations  that  they  represent  <Rx,Ry,...>. 
Consider  the  closure  of  these  relations  according  to  the 
relational  algebra.  Any  relation  R  obtained  from 
<Rx,Ry,...>  according  to  the  relational  algebra  will  have  as 
domains  some  of  the  domains  of  <Rx,Ry,...>.  The  relation  R 
will  represent  some  relationship  between  data  elements  of 
these  domains.  We  claim  that  any  such  relationship,  for  any 
R,  can  be  represented  by  an  expression  in  the  following 
sense . 

Consider two domains of R (A,B). Consider any value <a>
of A. Under R there will be a corresponding set of values
(possibly empty) for B, <b1,b2,b3,...>. We claim that given
the value <a> we can obtain an expression E which contains
all the tuples of the appropriate record type which have
values for domain B <b1,b2,b3,...>. Hence we can represent
the relationship value <a> goes to <b1,b2,b3,...> by
constructing the appropriate expression and extracting from
its records the values for domain B <b1,b2,b3,...>. It is
obvious that the same applies for a set of values
<a1,a2,...>, since expressions are closed under union.

We will try to justify our claim. Considering the
informal  nature  of  our  argument,  the  word  proof  is 
inappropriate.  Nevertheless,  we  feel  that  all  definitions 
can  be  made  very  formal  and  a  proof  can  be  obtained.  We 
will  consider  the  operations  of  the  relational  algebra 
individually. 

1)  Union,  Intersection,  Difference  and  Projection.  In 
the first three cases we have no problem since the
operations are also present in expressions. Projection
is not an issue because of the way we claim our
completeness.

2)  Cartesian  product  of  R,S 

The  cartesian  product  of  two  record  types  X,Y  can  be 
obtained  with  a  link  which  has  as  definition  a  predicate 
always  true.  This  will  link  every  record  of  X  with 
every  record  of  Y. 

3)  Join  of  R,S 

The  join  of  two  relations  represented  by  X,Y  can  be 
obtained  using  a  link  which  relates  appropriate  records 
x and y. The predicate describing the join will be used
in  the  definition  of  the  link. 

4) Restriction of R

The restriction of a relation R corresponding to a
record type X can be represented as an operator on X.
The operator will isolate only those records which
qualify according to the restriction.

5) Division of R

[...]

We did not in any way complete the argument, since we
did  not  show  that  it  applies  to  operations  on  relations 
which  do  not  correspond  to  record  types.  We  feel  though 
that  the  same  mechanism  will  apply,  only  the  expressions 
will  get  longer. 

Besides completeness, there are other operations which
should be provided to make the framework appropriate. The
currency indicators, manual operators and GET NEXT commands
presented in section 5 are particular examples. Note that
these currency indicators and manual operators are not
exactly  enough  to  navigate  in  the  DBTG  sense  [Bachman  1973]. 
Manual  navigation  was  excluded  on  purpose.  These  facilities 
were  only  included  for  ease  of  output.  Other  suggestions  of 
the  same  type  are: 

1)  Define  operators  with  free  variables  which  can  be 
created by fixing the free variables. For example, Pa
according  to  the  value  of  a  data  item  A.  For  every 
value  <a>  of  A,  Pa  represents  the  subset  of  records  with 
A=a. 

2)  Create  a  new  record  type  according  to  an  expression. 
This  will  produce  an  independent,  time  invariant 
snapshot  of  the  set  of  records  represented  by  the 
expression. 

3)  Create  a  new  record  type  from  a  utility.  This  is 
useful  for  loading. 
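
Suggestions 1) and 2) can be sketched as follows. This is only an
illustration under assumed names (P, snapshot, and the sample records
are hypothetical): an operator with a free variable is modelled as a
function that returns a concrete operator once the variable is fixed,
and a snapshot is an independent copy of the records selected by an
expression.

```python
# Hypothetical sketch; P, snapshot and the sample records are assumptions.
import copy

records = [{"id": 1, "A": "x"}, {"id": 2, "A": "y"}, {"id": 3, "A": "x"}]

def P(item):
    """Operator with a free variable: fixing a value yields a concrete operator."""
    def fix(value):
        def operator(recs):
            # Positions of the records whose data item equals the fixed value.
            return [i for i, r in enumerate(recs) if r[item] == value]
        return operator
    return fix

P_A = P("A")      # the free variable is the value of data item A
P_x = P_A("x")    # fixing the free variable gives the operator for A = "x"

def snapshot(recs, operator):
    """New record type: an independent copy of the records an expression selects."""
    return [copy.deepcopy(recs[i]) for i in operator(recs)]

snap = snapshot(records, P_x)
records[0]["A"] = "z"   # a later update changes the operator's result,
                        # but leaves the snapshot untouched
```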

Note  that  links  and  operators  can  also  represent 
inverted  files.  Consider  an  inverted  file  inverting  values 
of  domain  A  of  a  record  type  X.  We  can  view  the  inverted 
file  as  a  new  record  type  Y  which  contains  only  the  domain 
A.  There  is  an  operator  Pa  defined  on  Y  which  for  any 
value  a  of  A  gives  the  appropriate  record  of  Y.  This 
operator  is  a  free  variable  operator  which  is  very  fast  to 
compute  (using  ordering  or  hashing).  The  inverted  file  is 
then  represented  by  a  link  Lyx  connecting  records  of  type  Y 
and  X  which  have  the  same  values  of  A.  Given  a  value  a  of  A 
the  records  of  X  which  have  this  value  are  represented  by


In  this  manner  we  can  treat  inverted  files  as  fast 
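
The view of an inverted file as a record type Y, an operator Pa on Y and
a link Lyx can be sketched as below. The function build_inverted_file,
the dictionary used for hashing, and the sample data are assumptions of
this example, not the representation used in the paper.

```python
# Hypothetical sketch; build_inverted_file and the sample data are assumptions.

def build_inverted_file(xs, item):
    ys = []       # record type Y: one record per distinct value of the item
    p = {}        # operator Pa: value -> position of the record in Y (hashing)
    l_yx = []     # link Lyx: for each Y record, positions of the matching X records
    for i, x in enumerate(xs):
        value = x[item]
        if value not in p:
            p[value] = len(ys)
            ys.append({item: value})
            l_yx.append([])
        l_yx[p[value]].append(i)
    return ys, p, l_yx

X = [{"name": "Ann", "A": 10}, {"name": "Bob", "A": 20}, {"name": "Cy", "A": 10}]
Y, Pa, Lyx = build_inverted_file(X, "A")

def lookup(value):
    """Records of X with A = value: apply Pa on Y, then follow Lyx."""
    if value not in Pa:
        return []
    return [X[i] for i in Lyx[Pa[value]]]
```

Here Pa is fast because it is a hash lookup; an ordered index would serve
the same purpose.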

Implementation of links is a problem similar to the
implementation of DBTG sets [Codasyl DBTG]. We use pointer
arrays in a manner similar to the one proposed by DBTG. We
call such pointer arrays basic elements. They have been
proposed in [Tsichritzis 1974] and implemented in the ZETA
project [Czarnik 1974, Brodie et al 1974].

A basic element is a data structure consisting of a
header and a body. The header is a small, fixed-size
descriptor which serves as a control block for an ordered
set of slots. The number of slots (i.e., the size of the
body) can vary. Each slot is of fixed size and may or may
not be empty. A slot's contents are not necessarily unique.
A slot will usually be interpreted as an offset into a
conventional file, or a pointer to a record in an ordered
set of records, or an index to another basic element. A
basic element will usually be interpreted as a set of
pointers, or pointer array.

There  is  a  set  of  standard  commands  which  can  be  used  to 
manipulate  basic  elements  [Czarnik  1974].  Using  such
commands  one  can  create  or  destroy  basic  elements,  increase 
or  decrease  the  body  size  and  search  or  change  the  values  in 
the  slots.  We  will  not  discuss  the  function  and  format  of 
the  commands  in  detail. 
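
A minimal sketch of a basic element and of the kinds of commands just
listed is given below. The field and method names (Header, BasicElement,
create, resize, change, search) are assumptions of this example and do
not reproduce the command set of [Czarnik 1974].

```python
# Hypothetical sketch of a basic element; all names are assumptions.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Header:
    name: str        # identifies the basic element
    size: int = 0    # current number of slots in the body

@dataclass
class BasicElement:
    header: Header
    # The body is an ordered set of slots; None marks an empty slot.
    body: List[Optional[int]] = field(default_factory=list)

    @classmethod
    def create(cls, name, size=0):
        """Create a basic element with the given number of empty slots."""
        return cls(Header(name, size), [None] * size)

    def resize(self, new_size):
        """Increase or decrease the body size, keeping surviving slot values."""
        if new_size < 0:
            raise ValueError("body size cannot be negative")
        if new_size < len(self.body):
            del self.body[new_size:]
        else:
            self.body.extend([None] * (new_size - len(self.body)))
        self.header.size = new_size

    def change(self, slot, value):
        """Change the value held in one slot."""
        self.body[slot] = value

    def search(self, value):
        """Positions of all slots holding value (contents need not be unique)."""
        return [i for i, v in enumerate(self.body) if v == value]

be = BasicElement.create("Pa", 2)
be.change(0, 7)
be.change(1, 7)
be.resize(4)
be.change(3, 9)
```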

Basic elements can be combined serially or in parallel.
Serial combination refers to concatenation of the pointer
arrays represented by the bodies of the basic elements.
Parallel combination refers to considering the corresponding
(same order) slots of basic elements in parallel. The order
of slots gives the association between elements pointed to
by the pointers in the slots, much like a transposed file.
Serial and parallel composition of basic elements can be
used to represent associations between record types
[Tsichritzis 1974, Brodie et al 1974]. We use them to
implement links and operators on record types.
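
The two kinds of combination can be sketched on plain pointer arrays.
The functions serial and parallel below are hypothetical helpers written
for this illustration; bodies are represented as Python lists of record
positions.

```python
# Hypothetical sketch; bodies of basic elements are modelled as Python lists.

def serial(*bodies):
    """Serial combination: concatenate the pointer arrays."""
    combined = []
    for body in bodies:
        combined.extend(body)
    return combined

def parallel(*bodies):
    """Parallel combination: read the same-order slots side by side."""
    if len({len(body) for body in bodies}) > 1:
        raise ValueError("parallel combination needs bodies of equal size")
    return list(zip(*bodies))

a = [0, 1]
b = [5, 6]

chained = serial(a, b)      # one longer pointer array
paired = parallel(a, b)     # slot k of a is associated with slot k of b
```

In the parallel case the association is carried entirely by slot order,
as in a transposed file.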


An operator on a record type can be represented by one
basic element or a serial combination, each slot holding a
pointer to a record which is selected by the operator. The
record type name and the definition of the operator are
encoded, or pointed to, in the header.

A link can be represented in many different ways
depending on its characteristics. Consider, for instance, a
1:1 link, namely one where both Lxy and Lyx are functional.
Such a link can be represented by two basic elements
considered in parallel. Corresponding slots point to the
records of the two record types which correspond according
to the link.


Consider a 1:N link. In the case where N is bounded
and small, we can still use a small number (i.e., N+1) of
basic elements combined in parallel. One basic element is
used in one direction (functional) of the link and N for the
reverse direction. In the case where N is rather large and
unbounded, we need a basic element per owner record
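
The link representations described above can be sketched with plain
pointer arrays. Everything named here (follow_1to1, members_of,
owner_of, per_owner and the sample pointers) is an assumption made for
illustration; None stands for an empty slot.

```python
# Hypothetical sketch; pointer arrays are Python lists of record positions.

# 1:1 link: two basic elements in parallel; slot k of each array holds
# the two records that correspond according to the link.
x_slots = [0, 1, 2]
y_slots = [2, 0, 1]

def follow_1to1(src, dst, pointer):
    """Follow a 1:1 link from a record in src to its partner in dst."""
    return dst[src.index(pointer)]

# 1:N link with N bounded (here N = 2): N + 1 arrays in parallel,
# one for the owners and N for the members.
owner = [0, 1]
members = [[0, 2], [1, None]]   # owner 0 has members 0 and 1; owner 1 has member 2

def members_of(owner_ptr):
    k = owner.index(owner_ptr)
    return [m[k] for m in members if m[k] is not None]

def owner_of(member_ptr):
    """The functional direction of the link."""
    for m in members:
        if member_ptr in m:
            return owner[m.index(member_ptr)]
    return None

# 1:N link with N large and unbounded: one basic element per owner record.
per_owner = {0: [0, 1, 5, 7], 1: [2]}
```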

The hardest problems in the implementation of links and
operators arise not from their representation but from
their dynamic nature. Automatic links and operators, after
creation, have to reflect the data values present in the
data base. Therefore updates, insertions and deletions in a
record type directly affect the automatic operators and
links. Inconsistencies pose serious problems. The access
paths have to reflect the data, since we do not use manual
links. We will decompose the problem into four subproblems
and give some hints of possible solutions.

1) Negative aspect of update. When a record is updated
we may have to drop some of its associations according
to operators or links, because it no longer qualifies.
First, we can check which links or operators are
affected by comparing the definitions and the updated
data items. Second, we can follow the inverse links (if
they exist) and correct the pointers. A very different
solution is to postpone the corrective action for
later. We accept an inaccurate link and we correct it
when we need it (or when we have the time). This can be
done by checking the definition every time we follow a
link, or apply an operator. Incorrect associations can
be disregarded. If we time stamp each update and the
generation, or organization, of a link, then we can
compare the time stamps to decide if any corrective
action is necessary.


2)  Positive  aspect  of  update.  When  a  record  is  updated
we  may  have  to  add  some  new  associations  according  to 
operators,  or  links.  This  is  by  far  the  hardest 
problem.  If  we  have  the  definition  of  the  link,  or 
operator  in  closed  form  as  an  equation  of  data  item 
names  and  values,  we  may  be  able  to  solve  the  equation. 
This  way  we  know  the  data  item  values  of  records  which 
have  to  be  associated  according  to  the  updated  record. 
We  could  then  use  inverted  files  to  obtain  the 
associated  records  and  represent  the  new  associations. 
In  order  to  follow  such  a  solution  some  restrictions 
have  to  be  made  in  the  definition  of  links.  This  is 
realistic,  because  the  alternative  is  very  costly. 
Namely  for  an  arbitrary  link  or  operator  we  essentially 
have  to  recreate  them  completely  or  partially  every  time 
there  is  an  update.  Alternatively  we  can  postpone  the 
recreation  until  they  are  used. 


3) Insertion. When a new record is inserted we need to
associate it in links, or include it in operators. This
action can be taken at insertion time by looking at the
definitions and possibly solving for the right data
items. Alternatively, insertions, operators and links
can be time stamped. When a link or operator is applied,
the system then dynamically checks the new insertions.
The interesting insertions are the ones occurring after
the creation, or reorganization, of the link or operator.


4) Deletion. When a record is deleted its associations
have to be taken out from the links or operators that
are affected. This can be done by following inverse
links, or looking at the operator's definition.
Alternatively, we can leave the record in its place and
flag it void. This way when a link or operator connects
to it, it will be discarded. Reorganization occurs to
collect the deleted records and change the link and
operator implementations.
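
The postponed ("lazy") variants of the four subproblems can be sketched
together for an automatic operator. This is one possible reading under
stated assumptions, not the authors' implementation: a logical counter
stands in for time stamps, deleted records are flagged void, and the
class and method names (RecordType, Operator, apply, reorganize) are
hypothetical. The linear scan for recent changes is kept for brevity; a
real system would more likely keep a log of changes.

```python
# Hypothetical sketch of lazily maintained automatic operators.
import itertools

_clock = itertools.count(1)   # logical clock standing in for time stamps

class RecordType:
    def __init__(self):
        self.records = []     # each entry holds the data, a void flag and a stamp

    def insert(self, data):
        self.records.append({"data": dict(data), "void": False,
                             "stamp": next(_clock)})
        return len(self.records) - 1

    def update(self, i, **changes):
        self.records[i]["data"].update(changes)
        self.records[i]["stamp"] = next(_clock)

    def delete(self, i):
        self.records[i]["void"] = True   # leave the record in place, flag it void

class Operator:
    def __init__(self, rtype, predicate):
        self.rtype, self.predicate = rtype, predicate
        self.reorganize()

    def reorganize(self):
        """Rebuild the pointer array from the definition and restamp it."""
        self.pointers = [i for i, r in enumerate(self.rtype.records)
                         if not r["void"] and self.predicate(r["data"])]
        self.stamp = next(_clock)

    def apply(self):
        recs = self.rtype.records
        # Stored pointers: skip void records; recheck the definition only for
        # records changed after the last (re)organization (negative aspect).
        result = [i for i in self.pointers
                  if not recs[i]["void"]
                  and (recs[i]["stamp"] < self.stamp
                       or self.predicate(recs[i]["data"]))]
        # Records inserted or updated since then are checked dynamically
        # (insertion and positive aspect).
        known = set(self.pointers)
        result += [i for i, r in enumerate(recs)
                   if i not in known and r["stamp"] > self.stamp
                   and not r["void"] and self.predicate(r["data"])]
        return sorted(result)

rt = RecordType()
a = rt.insert({"A": 1})
b = rt.insert({"A": 2})
op = Operator(rt, lambda d: d["A"] == 1)
first = op.apply()
rt.update(a, A=3)             # a no longer qualifies
rt.update(b, A=1)             # b now qualifies
c = rt.insert({"A": 1})       # a new qualifying record
after_changes = op.apply()
rt.delete(c)
after_delete = op.apply()
op.reorganize()               # the stored pointers catch up with the data
```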

8.  Concluding  remarks

We  have  presented  a  framework  for  implementation  of 
relations.  Many  decisions  are  still  open.  We  have 
concentrated  on  the  mechanisms  and  not  their  optimization. 
We  feel  a  network  framework  can  be  very  appropriate  for 
relation  implementation.  A  hierarchical  framework  is  rather 
constrained.  Only  certain  relations  can  be  represented 
without  data  redundancy. 

We  have  already  implemented  a  relational  prototype 
system  [Brodie  et  al].  It  incorporates  some  of  the  ideas 
presented  in  this  paper.  We  hope  to  build  another  system 
which  will  be  even  closer  to  the  network  framework  outlined 
in  this  paper.  We  do  not  claim  to  have  solved  the  relation 
implementation  problems.  We  only  have  an  idea  where  some  of 
them  will  appear  in  our  particular  framework.  It  is  also 
important  to  note  that  what  we  have  proposed  is  far  from  a 
complete  design.  Many  details  have  to  be  worked  out  to 
ensure  an  efficient  system.  We  hope  that  our  linguistic  and 
efficient  implementation.